Here are some common distributions that describe the random fluctuations found in data analyzed by
biostatisticians:
Normal: The familiar, bell-shaped, normal distribution is probably the most common distribution
you will encounter. As an example, systolic blood pressure (SBP) is found to follow an approximately
normal distribution in human populations.
Log-normal: The log-normal distribution is a skewed distribution with a long tail toward high values. This distribution
describes many laboratory results, such as enzymes and antibody titers, where most of the
population tests on the low end of the scale. It is also the distribution seen for lengths of hospital
stays, where most stays are 0 or 1 days, and the rest are longer.
Binomial: The binomial distribution describes proportions, and represents the likelihood that a
value will take one of two mutually exclusive values (such as whether an event occurs or does not occur). As
an example, in a class held regularly where students can only pass or fail, the proportion who fail
will follow a binomial distribution.
Poisson: The Poisson distribution describes the number of occurrences of sporadic random events
(in contrast to the binomial distribution, which is used for more common events). In biostatistics, the
Poisson distribution is used where the events are less common, such as deaths from specific cancers
each year.
Chapter 24 describes these and other distribution functions in more detail, and you also encounter them
throughout this book.
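To get a feel for how these four distributions behave, the following minimal sketch simulates each one using only the Python standard library. Every number in it (a mean SBP of 120 with a standard deviation of 15, a class of 30 students with a 10 percent failure rate, an average of 3 deaths per year) is a made-up illustrative assumption, not a value from this book, and the helper functions draw_binomial and draw_poisson are hypothetical names introduced for the sketch.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def draw_binomial(n_trials, p_event):
    """Count events in n_trials independent yes/no trials, each with probability p_event."""
    return sum(rng.random() < p_event for _ in range(n_trials))


def draw_poisson(mean_count):
    """Draw one Poisson count by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean_count)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


N = 10_000  # number of simulated values per distribution

# All parameters below are illustrative assumptions.
sbp = [rng.gauss(120, 15) for _ in range(N)]               # normal
titer = [rng.lognormvariate(1.0, 0.8) for _ in range(N)]   # log-normal
fails = [draw_binomial(30, 0.10) for _ in range(N)]        # binomial
deaths = [draw_poisson(3.0) for _ in range(N)]             # Poisson

for name, sample in [("normal", sbp), ("log-normal", titer),
                     ("binomial", fails), ("Poisson", deaths)]:
    print(f"{name:11s} mean={statistics.mean(sample):7.2f} "
          f"median={statistics.median(sample):7.2f}")
```

In the printed summary, the mean and median of the normal sample should nearly coincide, whereas the log-normal mean sits well above its median because the long upper tail pulls the mean upward.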
Distributions important to statistical testing
Some probability distributions don’t describe fluctuations in data values but instead describe
fluctuations in calculated values as part of a statistical test (when you are calculating what’s called a
test statistic). Distributions of test statistics include the Student t, chi-square, and Fisher F
distributions. Test statistics are used to obtain the p values that result from the tests. See “Getting the
language down” later in this chapter for a definition of p values.
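The idea that a calculated value has its own distribution can be seen with a small simulation. The sketch below repeatedly draws a tiny sample from a normal population, computes a one-sample Student t statistic each time, and checks how often that statistic lands in the tails. The population mean of 120, the standard deviation of 15, the sample size of 5, and the function name t_statistic are assumptions chosen for illustration only.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable


def t_statistic(sample, null_mean):
    """One-sample t statistic: (sample mean - hypothesized mean) / standard error."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - null_mean) / standard_error


SAMPLE_SIZE = 5      # small on purpose, so the t and normal distributions differ visibly
N_STUDIES = 20_000   # number of simulated studies

# Every simulated study samples a population whose true mean really is 120,
# so any nonzero t value here is due to random fluctuation alone.
t_values = [
    t_statistic([rng.gauss(120, 15) for _ in range(SAMPLE_SIZE)], 120)
    for _ in range(N_STUDIES)
]

T_CRITICAL_4DF = 2.776  # two-sided 5% cutoff of the Student t distribution, 4 degrees of freedom
Z_CRITICAL = 1.960      # two-sided 5% cutoff of the standard normal distribution

frac_beyond_t = sum(abs(t) > T_CRITICAL_4DF for t in t_values) / N_STUDIES
frac_beyond_z = sum(abs(t) > Z_CRITICAL for t in t_values) / N_STUDIES

print(f"fraction with |t| > {T_CRITICAL_4DF}: {frac_beyond_t:.3f}")
print(f"fraction with |t| > {Z_CRITICAL}: {frac_beyond_z:.3f}")
```

The first fraction should come out close to 0.05, because the simulated statistic follows the Student t distribution with 4 degrees of freedom. The second fraction should be noticeably larger, which is why the cutoff has to come from the test statistic's own distribution rather than from the normal distribution.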
Introducing Statistical Inference
Statistical inference is where you draw conclusions (or infer) about a population based on
estimates calculated from a sample of that population. The challenge posed by statistical inference theory is
to extract real information from the noise in your data. This noise is made up of the random
fluctuations described earlier as well as measurement error. This very broad area of statistical theory can be subdivided
into two topics: statistical estimation theory and statistical decision theory.
Statistical estimation theory
Statistical estimation theory focuses on how to improve the accuracy and precision of metrics calculated
from samples. It provides methods to estimate how close your sample-based estimates are likely to be to the true
population value, and to calculate a range of values from your sample that's likely to include the true
population value. The following sections review the fundamentals of statistical estimation theory.
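As a rough preview of what such a calculation looks like, the sketch below estimates a population mean from a sample, computes the standard error of that estimate, and builds an approximate 95 percent range around it. The 20 SBP readings are invented for illustration, and the range uses the normal-distribution multiplier (about 1.96) as a simplifying assumption. With a sample this small, a multiplier from the Student t distribution would give a somewhat wider range.

```python
import math
import statistics

# Hypothetical SBP readings (mmHg) from a sample of 20 people; invented for illustration.
sbp_sample = [118, 125, 132, 110, 121, 140, 115, 128, 119, 123,
              135, 112, 127, 130, 117, 122, 126, 114, 138, 120]

n = len(sbp_sample)
mean_sbp = statistics.mean(sbp_sample)   # estimate of the population mean
sd_sbp = statistics.stdev(sbp_sample)    # sample standard deviation
se_mean = sd_sbp / math.sqrt(n)          # standard error: precision of the mean

# Multiplier for a two-sided 95% range under a normal approximation (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low = mean_sbp - z * se_mean
ci_high = mean_sbp + z * se_mean

print(f"mean = {mean_sbp:.1f} mmHg, standard error = {se_mean:.2f} mmHg")
print(f"approximate 95% range: {ci_low:.1f} to {ci_high:.1f} mmHg")
```

Because the standard error divides by the square root of the sample size, quadrupling the number of readings would roughly halve the width of the range.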
Accuracy and precision